Tootfinder

@arXiv_statML_bot@mastoxiv.page
2024-05-08 07:24:57

Federated Control in Markov Decision Processes
Hao Jin, Yang Peng, Liangyu Zhang, Zhihua Zhang
https://arxiv.org/abs/2405.04026

We study problems of federated control in Markov Decision Processes. To solve an MDP with large state space, multiple learning agents are introduced to collaboratively learn its optimal policy without communication of locally collected experience. In our settings, these agents have limited capabilities, which means they are restricted within different regions of the overall state space during the training process. In face of the difference among restricted regions, we firstly introduce concepts…

@arXiv_csGT_bot@mastoxiv.page
2024-05-07 08:45:38

This https://arxiv.org/abs/2302.08108 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csGT_…

User Response in Ad Auctions: An MDP Formulation of Long-Term Revenue Optimization
We propose a new Markov Decision Process (MDP) model for ad auctions to capture the user response to the quality of ads, with the objective of maximizing the long-term discounted revenue. By incorporating user response, our model takes into consideration all three parties involved in the auction (advertiser, auctioneer, and user). The state of the user is modeled as a user-specific click-through rate (CTR) with the CTR changing in the next round according to the set of ads shown to the user in …

@arXiv_csLG_bot@mastoxiv.page
2024-02-19 06:52:04

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, Mengdi Wang
https://arxiv.org/abs/2402.10810

We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure, subject to a convex constraint. Designing algorithms for a constrained convex MDP faces several challenges, including (1) handling the large state space, (2) managing the exploration/exploitation tradeoff, and (3) solving the constrained optimization where the objective and the constraint are both nonlinear functions of the visitation measure. In this work,…

@arXiv_csGT_bot@mastoxiv.page
2024-03-08 06:49:58

RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning
Boning Li, Zhixuan Fang, Longbo Huang
https://arxiv.org/abs/2403.04344

Effective action abstraction is crucial in tackling challenges associated with large action spaces in Imperfect Information Extensive-Form Games (IIEFGs). However, due to the vast state space and computational complexity in IIEFGs, existing methods often rely on fixed abstractions, resulting in sub-optimal performance. In response, we introduce RL-CFR, a novel reinforcement learning (RL) approach for dynamic action abstraction. RL-CFR builds upon our innovative Markov Decision Process (MDP) for…

@arXiv_eessSY_bot@mastoxiv.page
2024-03-26 07:01:54

Fisher Information Approach for Masking the Sensing Plan: Applications in Multifunction Radars
Shashwat Jain, Vikram Krishnamurthy, Muralidhar Rangaswamy, Bosung Kang, Sandeep Gogineni
https://arxiv.org/abs/2403.15966

How to design a Markov Decision Process (MDP) based radar controller that makes small sacrifices in performance to mask its sensing plan from an adversary? The radar controller purposefully minimizes the Fisher information of its emissions so that an adversary cannot identify the controller's model parameters accurately. Unlike classical open loop statistical inference, where the Fisher information serves as a lower bound for the achievable covariance, this paper employs the Fisher information …
